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ABSTRACT 


Supermassive black hole binaries are one of the primary targets for gravita¬ 
tional wave searches nsing pnlsar timing arrays. Gravitational wave signals from 
snch systems are well represented by parametrized models, allowing the standard 
Generalized Likelihood Ratio Test (GLRT) to be nsed for their detection and es¬ 
timation. However, there is a dichotomy in how the GLRT can be implemented 
for pnlsar timing arrays: there are two possible ways in which one can split the 
set of signal parameters for semi-analytical and nnmerical extremization. The 
straightforward extension of the method nsed for continnous signals in gronnd- 
based gravitational wave searches, where the so-called pnlsar phase parameters 
are maximized nnmerically, was addressed in an earlier paper ( Wang et al.||2014 ). 
In this paper, we report the hrst stndy of the performance of the second approach 
where the pnlsar phases are maximized semi-analytically. This approach is scal¬ 
able since the nnmber of parameters left over for nnmerical optimization does 
not depend on the size of the pnlsar timing array. Onr results show that, for the 
same array size (9 pulsars), the new method performs somewhat worse in param¬ 
eter estimation, but not in detection, than the previous method where the pulsar 
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phases were maximized numerically. The origin of the performance discrepancy is 
likely to be in the ill-posedness that is intrinsic to any network analysis method. 
However, scalability of the new method allows the ill-posedness to be mitigated 
by simply adding more pulsars to the array. This is shown explicitly by taking a 
larger array of pulsars. 

Subject headings: pulsar timing array: general — continuous gravitational waves: 
detection algorithm 


Introduction 


Pulsar timing array (PTA) based gravitational wave (GW) search is a promising ap¬ 
proach for the very low frequency ( 10 ® — 10 ® Hz) regime (|Sazhin|[l978[ [Foster fc Backerj 


1990 Jenet et ah 2005), that is complimentary to the second-generation ground-based in¬ 


terferometers, such as Advanced LIGO (Waldman 2011), Advanced Virgo ( |Degallaix et ah 
2013), and KAGRA (Somiya 2012) operating at high frequencies (~ 10 — 10^ Hz), as well 


as to the space-based detectors, such as eLISA (Seoane et al. 2013) proposed for low fre¬ 
quencies (~ 10“^ — 10“^ Hz). Unlike man-made instruments, PTA uses a network of high 
precision astronomical clocks, i.e., millisecond pulsars (MSPs), as a galactic-scale GW de¬ 
tector. Gurrently, three regional PTAs (NANOGra\Q PPTAj^and EPTA|^ are operating at 
astrophysically interesting sensitivities that may lead to the detection of GWs in the near 
future. Shared data as well as collaborative and competitive efforts among individual PTAs 
bond them as the International Pulsar Timing Array (IPTApj [Manchester (2013); McLaugh¬ 
lin (2014)). The IPTA uses some of the most advanced radio telescopes in the world today to 


regularly monitor about 50 pulsars. Next generation radio telescopes with larger collecting 


areas and better backend systems, such as FAST (Hobbs et al. 2014) and SKA (Smits et al. 


2009), will join the global observation campaign in the future and push pulsar timing to 


higher precision and better detection sensitivities. 

A promising GW signal for PTA is the stochastic background formed by the incoherent 
superposition of weak signals from a large unresolved population of supermassive black hole 
binaries (SMBHBs) (Detweiler 1979; Romani & Taylor 1983| Foster & Backer p!990 ; Jaffe & 


^http: //www.nanograv.org/ 

^http://www.atnf.csiro.au/research/pulsar/ppta/ 
^http://www.epta.eu.org/ 

^http: / / www.ipta4gw.org/ 































































Backer||2003 ; Wyithe & Loeb 2003 Jenet et al. 2005). The stochastic GW perturbation will 
cause noise like signals in the pulsar time of arrivals (TOAs) that will be correlated across the 
pulsars in an array. The correlation will depend on the strength of the stochastic background 


and the pair-wise angular separation between the pulsars (Hellings & Downs 1983). Upper 


limits on the strength of the stochastic background along with our understanding of source 
population have been improving over the years in correspondence with improvements in data 
quality ( Jenet et ah]|2006 Yardley et al.||2011 van Haasteren et al.j2011 ; [Shannon et al.||2013 
Demorest et al. 2013). 


In addition to the stochastic background, there exists the interesting possibility of de¬ 


tecting GWs from individual SMBHB sources (Detweiler 1979 Lommen & Backer 2001 


Jenet et al. 

2004 

Seto 

2009 

). Simulations covering a range of massive black hole population 

models ( 

Sesana et al.||2009; 

Sesana & Vecchio||2010 

) have shown that on average at least one 


source may be resolvable against the stochastic background. In the past few years, interest in 


analyzing continuous GW signals from individual SMBHBs has increased considerably (Lee 


et al. 2011 Deng & Finn 2011 Mingarelli et al. 2012 Ravi et al. 2014). Gorrespondingly, 


searches for continuous signals in the recent PTA data have been conducted in parallel with 


the stochastic background (Yardley et al. 2010 Arzoumanian et al. 2014 Zhu et al. 2014) 


The detection and parameter estimation of continuous waves from individual sources in 


a PTA is a challenging data analysis task that has led to a number of studies ( 

Yardley et al. 

2010 Gorbin & Gornish||2010 Babak & Sesana||2012 Ellis et al.||2012 Ellis||2013i Wang et al. 

2014 Taylor et al. 

2014 

Zhu et al. 2015). Unlike ground and space based detectors, the 


analysis of PTA data must contend with irregularly sampled time series with possible gaps, 
and noise components that must be estimated along with the signal as well as components 
that may be non-Gaussian or non-stationary ( Wang fc et al.[20T5 ). As with any complex data 
analysis problem, a wide range of independent and complementary approaches are needed 
to build conhdence in the hnal results. 


This paper follows an earlier investigation reported in Wang et al. (2014) (hereafter 


WMJl), where a Generalized Likelihood Ratio Test (GLRT) (Kay 1998) was constructed 


along the line of existing continuous wave signal searches used for ground based detectors 


(Jaranowski et al. 1998 Gutler & Schutz 2005). The WMJl method explicitly includes the 


pulsar terms in the signal model and considers them as functions of pulsar phases. Numerical 
implementation of the GLRT usually involves a division of the signal parameters into the 
so-called extrinsic ones, over which the likelihood ratio can be maximized analytically or 
semi-analytically (including the use of Fast Fourier Transform), and the intrinsic ones for 
which a pure numerical optimization is required. However, unlike the ground based search, 
this division of the parameters into extrinsic and intrinsic is not unique in the case of a 
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PTA. Following the convention used for the J^-statistic ( Jaranowski fc Kr61ak||2012 ), WMJl 
explored the choice that takes the overall amplitude of the signal, the inclination angle 
between the binary orbit and the plane of the sky, and the polarization angle of the GW as 
extrinsic and treated the pulsar phases and remaining parameters as intrinsic. Our results 
showed that the pulsar phases are uninformative parameters, indicating that they are best 
marginalized or maximized as extrinsic parameters. The more important motivation to do 
so, however, is the fact that the number of pulsar phase parameters increases with the size 
of the PTA. Hence, the numerical optimization task will become infeasible at some point. 
Thus, the approach of treating pulsar phase parameters as intrinsic is not a scalable one. 

This paper presents the hrst implementation of a method based on treating the pulsar 
phases as extrinsic parameters in a GLRT. Although, the idea of semi-analytical maximiza¬ 


tion over pulsar phases was presented in Ellis et ah (2012), a concrete implementation and 


performance characterization of the resulting method has not been reported until now. The 
method proposed here retains the use of Particle Swarm Optimization (PSO) to handle 
the numerical optimization over intrinsic parameters, but the particular variant of the PSO 
meta-heuristic used in this paper is different. 

An alternative to maximization over the pulsar phases is to marginalize over them 
following a Bayesian framework. This is the approach that has been studied the most in the 
PTA literature so far (Ellis 2013 [Taylor et ah 2014). To enable meaningful comparisons, 
the performance of the method presented here is studied on simulated data corresponding 


to a PTA configuration adopted in Taylor et ah (2014). We use signal strengths, measured 


in terms of the network signal-to-noise ratio (S/N) p„, that span a wide range from strong 
{pn = 100) to moderate (p„ = 30) and barely detectable (p„ = 8). Although useful for testing 
the performance of the algorithm, > 20 is unrealistic for PTA based GW detection in 
the foreseeable future. Thus the performance of the method for = 8 to = 30 serves 
to bracket the scenario that is more likely. As in WMJl, we simulate a large number of 
independent data realizations and derive conventional Frequentist error estimates for the 
signal parameters. 

The results show that this method performs marginally better than the method in WMJl 
for detection, but the estimation of the angular parameters is somewhat worse. Specifically, 
while the localization of sources in WMJl and the Bayesian method are comparable, shifting 
to a different split of extrinsic and intrinsic parameters creates secondary maxima that 
increase the scatter of estimated source locations. This is most likely the result of the well 
known ill-posedness of the GW network analysis problem ( Klimenko et al.||2005 Rakhmanov 
2006 Mohanty et al.||2006 ). Ill-posedness in inverse problems, such as GW network analysis, 
is marked by instability or discontinuity of the inferred solution under small perturbations 
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in the data. The source of perturbation can be either the noise in the data or numerical 
errors from computations. The jumping of solutions to radically different values can manifest 
itself as a large bias or large variance in estimation. For strictly linear models, such as GW 
burst searches where the time samples of the two polarization waveforms directly form the 


parameters to be estimated (Rakhmanov 2006), ill-posedness is easily seen to be rooted in 


the rank deficiency of the matrix A^A, where A is the m x 2 network response matrix (m is 
the number of detectors). The origin of ill-posedness in parameter estimation presented in 
this work is not as straightforward because the signal model is nonlinear in the parameters. 

Mitigation of ill-posedness requires regularization in some form, such as the imposition 
of constraints on the GLRT solutions ( Greville]|1959 Tikhonov fc Arsenin||1977 ). While some 
constraints appear naturally in the implementation of GLRT in WMJl, they are absent in 
the formulation of the method presented here. The effects of ill-posedness are reduced, 
in general, by increasing the number of differently oriented detectors in a network. We 
demonstrate this by considering the case of a PTA with 17 pulsars. For this reason, we do 
not go deeper into the issue of regularization for PTA in this paper but leave it for future 
work to address. 

The rest of the paper is organized as follows. In Section we introduce the data model 
used in this paper. Section [^describes the GLRT for this data model and its implementation, 
which involves maximization over pulsar phases analytically by solving quartic equations. 
Section characterizes the method using simulated data and compares its performance with 
WMJl and Taylor et ah (2014). The paper is concluded in Section]^ Some details about 
solving the quartic equation have been relegated to Appendix A. 


2. Data model 

The data used for GW signal detection and parameter estimation in the case of a PTA 
consists of a set of timing residuals = (r(, r^,..., r^^), / = 1,2, ...,Ap, where Np is 
the number of pulsars, Nj is the number of observation for the J-th pulsar. Each timing 
residual is associated with a time of observation tl G [0,T], > ff. The time interval 

between two observations can vary typically from several days up to a few weeks. When 
there is a signal in the data, ^^ = 5^4- otherwise, = n^. Here = (n(,n 2 ,... 
and = (s(, S 2 ,..., denote the noise realization and the GW signal respectively. The 
models for the signal and the noise (zero mean stationary Gaussian) remain the same as in 
WMJl, but it is convenient to express the signal in a functional form that allows the pulsar 
phases to be easily extracted as extrinsic parameters in the detection statistic. 
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GWs perturb the proper distance between a pulsar and an observer on the Earth, causing 
fluctuations of the TOAs of radio pulses with time. In the TT-gauge associated with a plane 
GW, the perturbation in the metric tensor can be written as 

h=(h+e+ + hxex)e*(--*-‘^'"), (1) 

where Ug^ is the GW angular frequency, k is the GW wave vector, and 

e+ = Q:(8)Q: — (2a) 

ex = q:( 8)5 + 5G)Q:. (2b) 

CK and 5 are the unit vectors along right ascension and declination in equatorial coordinates. 
The response of the detector to the GW is given by 

s' (A) = q(a, i)Ak+(t'; 0) + (a, A) Afc, (i'; 9), (3) 

where and are the antenna pattern functions (defined in Equations 9 and 10 of 
WMJl), a and 6 are the right ascension and declination of the source, 6 represents collectively 
the following parameters: (i) the overall amplitude factor (defined in Equation 7 of WMJl); 
(ii) t, the inclination angle between the binary orbital plane and the plane of the sky; (hi) 
Ip, the GW polarization angle; (iv) cpo, the initial phase of the binary at the beginning of 
the observation; (v) parameter ifj = — }^ujg^d\l — cos6*^), the pulsar phase parameter 

that contains the distance from the pulsar to Earth and the open angle 6^ between the 
lines of sight to the pulsar and the GW source. Hereafter, we regard the pulsar phases 
as independent variables. A = {a, 6} U 6 denotes the set of all the parameters. The term 
Ah+^x(tl;9) is the difference of the GW tensor at Earth and at the pulsar at the observer’s 
time ff, 

Ah+^xitl;0) = h+,x(t-;0)-/i+,x(t{-r^;0), (4) 

where = d\l — cos6^)/c is the time delay of the plane GWs of the same phase arriving 
at Earth and at the pulsar. Hereafter, we assume that the binary system is evolving slowly, 
so that in the signal model the orbital frequency in the pulsar term remains approximately 
the same as in the Earth term. 

The GW signal can be written as 

si = 2({1 + cos'^ L){Flcos2p: - Flsm2'ip)sm{(po - <~Pi)sm{ipo + ipi + ^(tl)) 

— cos i{Fl sin 2'^ + F^ cos 2-^) sin((^o — ^i) cos{(po + p>i + ^(tl)) 

= Aism{Lpo-ipi)sm{(po + p>i + (pi + ^{tl)), ( 5 ) 

where <h(ff) = 

Aj = 4C{l + cos\f {Fl cos 2ij - Fl sin 2ipf 
+ 16C^ cos^ sin2-^ + cos2'^)^ , 


( 6 ) 
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and 

—2 cos L Fl sin 2'<p + cos 2 -?/’ 

^ 1 + cos^ i Fl cos 2-^ — Fx sin 2-^ 

Here and 0/ depend only on C, t, -0, a, 5. In Equation we can isolate the ipj dependence 


and get 

si = (Bi - Si) cos 2pi + (Cj + Vj) sin 2(p/ + {Bi + Sj ), 

( 8 ) 

where 

= ^Ai sin ipQsm{ipo + (fi + ^{tl)), 

( 9 ) 


Ciiti) = -^Al/cos(posin((^o + (fi + ^{tl)), 

( 10 ) 


= ^Al7sin(^ocos((po Fcfi + ^itl)), 

( 11 ) 


^il'^l) = -^Acos</9ocos(<y9o + 0/ + <h(t-)) • 

( 12 ) 


Note that Bj, Cj, Vj, Sj are functions of time and the source parameters rather than (fj. 


3. Generalized Likelihood Ratio Test 


In the Frequentist approach, the detection of GW signals presents a composite hypothe¬ 
ses test problem: Given data r, we need to pick one among a family of hypotheses about the 
joint probability density function (pdf) from which r is obtained. Under the null hypothesis 
Tio, the data does not contain any GW signal and the pdf, p(r), governing r is that of the 
noise alone. Under the alternative hypothesis Hx, a GW signal s(A) with parameters A is 
present in r and the data is a realization from a governing pdf of the form p(r| A) = p(r—s(A)). 
In a GLRT, assuming that the PDF of the noise p{r) is known, the test statistic 

pfrlA) ^ 

GLRT(r) = max ^ = max LR(r; A) = LR(r; A), (13) 

A p(r) A 


is compared with a threshold to decide in favor of "Ho or "Hy. Here, LR(r; A) is the likeli¬ 
hood ratio for a given hypothesis and A is the Maximum Likelihood Estimate (MLE) of the 
parameters. The maximizer. A, of LR(r; A) in Eq. 13 is the same as that of any monotonic 


function of LR(r;A). Its logarithm, A(r;A), is one such convenient choice for the case of 
Gaussian noise. 


Unlike the case of a known A, where the optimal test statistic (under the Neyman- 
Pearson criterion) is known to be LR(r;A), there is no proof of optimality associated with 
the GLRT except in some simple cases. However, it has been shown that it is the uniformly 






most powerful (UMP) among all invariant tests (Lehmann 1959). In practice, and when it 
is computationally feasible, the GLRT is often found to be superior to other ad hoc tests. 


3.1. The network likelihood ratio 


For a PTA of Ah pulsars, the log-likelihood ratio is 


Np 


A(r;A) = > A7(r;A), 


1=1 


A,(r;A) = y\s\X)),-f-{s\\)\s\\))j, 


(14) 


where ('I')/ is the noise weighted inner product, (■)C7^(-)^, with C/ being the covariance 
matrix of the noise process in the J-th pulsar. It is assumed here that the cross-covariances 
of noise between and are ignorable for I ^ J. Inserting Eq. into Eq. 14 we have 


A/(r;A) = [{r^\Xi)j cos 2(pj + {r^Yj)j sin 2 lpj + {r^\Zi)i 

— ^ [{Xi\Xj)I cos'^ 2ipi + {Yi\Yi)I sin^ 2ipi + 2{Xj\Yi)j sin2ipi cos2(p/ 

-|-2(X/|Z/)/cos2(^9/-|-2(y}|Z/)/sin 2(^9/-|-(Z/jZ/)/ )] , (15) 


where Xj = Bj — Sj, Yj = Ci + Vj, and Zj = Bj + £i- 


The calculation of the GLRT can be seen as a nested maximization problem, 

GLRT(r) = maxmax A(r; A) . (16) 

Ai Ae 

This split is meant to indicate that the whole model parameter set A can be divided into 
disjoint subsets classihed as extrinsic (inner maximization) Ae = and intrinsic (outer 

maximization) A* = {a, 5, C, t,-0, V^o}. Usually, the separation is made such that the 
former can be maximized using analytical (or semi-analytical) methods, while the latter 
requires computationally expensive numerical optimization. It should be emphasized that 
the classification of parameters as extrinsic (computationally trivial) and intrinsic (compu¬ 
tationally non-trivial) pertains to their role in the numerical procedure adopted for their 
estimation rather than their role in dehning the the astrophysical signal. 


3.2. Maximization over extrinsic parameters 


The inner maximization of the GLRT over the extrinsic parameters (Eq. 16) leads to, 
c{ sin 2ipi + C 2 cos 2(pi -|- Cg sin 4^9/ -f- cos 4lpj = 0 , (17) 
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where 


where 



(18a) 


(18b) 

{{Xi\Xj)i-{Yj\Yj)i), 

(18c) 

■{Xi\Yj)i. 

(18d) 

:an be transformed into a set of Np 

quartic equations 

by^ + cy^ + dy + e = Q 

(19) 

4(c^ + cD , 

( 20 a) 

4(CiC3 -f C2C4) , 

( 20 b) 

c\ + cl- 4(c^ + cl ), 

( 20 c) 

- 4 ciC 3 - 2 C 2 C 4 , 

( 20 d) 

2 2 

C4 - Cl . 

( 20 e) 


Here, we have suppressed the pulsar index I in Eq. 19 and 20 for clarity. 


A convenient numerical algorithm for solving quartic equations involves computing the 


eigenvalues of the 4x4 companion matrix (Press et ah 2002) 

/ —k _£ \ 

a a a a * 
10 0 0 

0 10 0 

\ 0 0 1 0 y 


D 


( 21 ) 


It is an upper Hessenberg matrix, for which the characteristic polynomial is Equation (19) 


with y as the eigenvalue. Hence, the set of its eigenvalues constitute the roots of the quartic 
equation. 

It is possible to get multiple real solutions (two or four) depending on the coefficients 


in Eq. Out of these solutions, we first select the ones whose absolute value is less than 
unity (since y = cos(2(p/)) and then select the one for which A/ is greatest. This ensures 
that the solutions for (p/ found above also maximize the network log-likelihood ratio since it 
is just the sum over A/. 


If no valid solution is found, then there is no turning point for A/ in Eq. For this 
case, the maximum of A/ will appear at the boundary of the allowed region, i.e., y = 1 
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{ifj = 0) and y = —1 {ipj = t^/2). We then evalnate A/ at the bonndary points and pick the 
one that gives the largest valne. 


3.3. Maximization over intrinsic parameters 


The onter maximization of the GLRT over the intrinsic parameters (Eq. [I 6 | reqnires 
a search for the global maximnm over the remaining 7-D intrinsic parameter space \i = 
{a, 6, cUgw, C) ^5 V') T’o}- This fnnction is highly mnlti-modal dne to the presence of noise in the 
data and degeneracies among the parameters. Deterministic local optimization fails to locate 
the global optimnm in snch a case and a brnte force grid search is compntationally prohibitive 
for snch a large nnmber of parameters. The only feasible approach is to nse algorithms that 
employ some type of a stochastic search scheme. As demonstrated in WMJl, Particle Swarm 
Optimization (PSO) ( |Eberhart fc Kennedy||1995 Wang fc Mohanty]|2010 Mohanty]|2012a|[b ) 
provides a relatively straightforward approach to snccessfnlly addressing this problem. 


PSO searches for the global optimnm of a given fitness fnnction nsing a stochastic 
sampling scheme. The sample points, called “particles”, are iteratively displaced according 
to the PSO dynamical eqnations. Relevant details of the PSO algorithm are provided in 
WMJl. Since the present optimization problem has a mnch lower dimensionality, one wonld 
expect that the same PSO algorithm as nsed in WMJl wonld work here too. However, onr 
initial tests showed that some tweaks were needed to achieve satisfactory performance. In 
order to describe these modifications, let ns first recapitnlate the PSO dynamical eqnations. 

Let f{x) be a fitness fnnction (i.e. the log likelihood ratio A(r; A) in onr case), where 
X G S' C M"' and S is called the search space and it is generally assnmed to be a hypercnbe, 
S = [a, 6 ] 0 [a, 6 ] ® ... (g) [a, h]. Let i = 1 , 2 ,..., A^part, be the position of the particle 

in a swarm of A^part particles at the iteration step k. The coordinates corresponding to Xj(fc) 
are (xj^i(A;),..., Xj^„(/c)). Associated with each particle is the location, pfik), called pbest 
(“particle best”), where the best fitness was fonnd in its history. 


fiPiik)) = , max fixfij)) . ( 22 ) 

j = k , k — l ,...,0 

Associated with the swarm is the location, g{k), called ghest (“global best”), where the best 
fitness was fonnd by the swarm. 


max 

j = l,...,7Vpart 




figm 


(23) 













Given Xi{k), Pi{k) and g{k), the following equations are used to evolve the swarm. 


(24) 

(25) 

(26) 


Xi{k + 1 ) 
Vij{k + 1 ) 

Viik + 1 ) 


Xi{k) + Vi{k); 

min (max (?/ij(A; + 1),-Umax) ,t^max) , 

w{k)vi{k) + mi^i{pi{k) - Xi{k)) + ini^ 2 {gik) - Xi{k)) , 


Randomness in the sampling is introduced through nijp, p = 1,2, a diagonal matrix, 
diag(mp,j 4 ,... ,mp^i^n), such that rup^i^k ~ G[0,Cp] is drawn from a uniform distribution over 
[0, Cp]. The parameters Cp, p = 1, 2 and the prescribed deterministic sequence w{k) determine 
the extent to which continuing exploration of the search space is balanced by exploitation 
and focussing of the search around a good value at a given iteration step. We set w{k) to 
be a linearly decaying sequence starting at 0.9 and ending at 0.4 at termination. At the 
termination of PSO, the highest fitness value found by the swarm, and the location of the 
particle with that fitness, make up the solution to the optimization problem. 

As in WMJl, we use a modified form of the above iteration equations where gbest is 
replaced by the best location. West, in a local neighborhood of each particle. We use the 
ring topology to determine the neighborhoods: the particle indices are put on a circle and 
the neighborhood of each particle consists of (m — l)/2 particles on each side with m being 
the user specihed size of each neighborhood. 


The settings for the parameters of the PSO algorithm outlined above are retained from 
WMJl: A^part = 40, Cl = C 2 = 2.0, m = 3, Umax = (& “ a)/5, n(aax = (p - «)/2, w{k) = 
0.9 — 0.5(/c/(Wter — 1)), where Wter = 2000 is the total number of iterations. In addition to 
fixing the PSO parameters, the behavior of particles crossing the boundary of S is handled 
using the “let them fly” boundary condition in which the fitness of the particle is simply 
set to —oo while it is outside S. The main modification to the PSO algorithm in this 
paper is the introduction of a local optimization of the gbest position, using the Nelder-Mead 
algorithm (Press et ah 2002), that is performed only when gbest changes. We believe that 
the maximization over the pulsar phases leaves behind a htness function that has ridge-like 
features in it. This expectation is based on similar behavior of the htness function, after 
initial phase maximization, in the case of compact binary inspiral signals for ground-based 
searches. The use of local optimization then moves the gbest location along these long ridges 
to better values more efficiently than a pure random move. However, a systematic study of 
these ideas is postponed to a future work. 


To increase the probability of successfully converging to within a sufficiently small neigh¬ 
borhood of the global maximum, multiple independent runs of PSO are made on the same 
data segment. Being mutually independent, these runs can be executed using simple paral¬ 
lelization on a multi-processor machine. Unlike the case of WMJl, where the computational 
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cost of evaluating the fitness function was relatively higher and only one run of PSO per data 
realization was feasible, we are able to execute 8 independent runs for each data realization 
in the present case. 

For simulated data, it is possible to gauge successful convergence to the global maximum 
by comparing the best fitness found with its value at the true signal location: the former 
should always be higher than the latter. This test is passed by PSO in all the cases discussed 
in the next section. 


4. Applications 


We illustrate the above algorithm (hereafter referred to as MaxPhase) using simulated 
data corresponding to a PTA configuration adopted in Taylor et ah ( 2014[ ) (see the paper 
and the references therein for ephemerides of the nine pulsars in the network). In all of 
the cases considered below, the source is a SMBHB in a circular orbit which is located 
at Right Ascension a = 1.0 rad (3 hr 49 min) and declination 6 = 0.5 rad (28°.7). The 
orbital angular angular frequency u = 1.96 rad yr“^ (cugw = 3.93 rad yr“^), the initial phase 
<Po = 2.89 rad (165°.6), the inclination angle t = 0.5 rad (28°.6) and the polarization angle 
"0 = 0.5 rad (28°.6) are also set to be the same values as in Taylor et ah (2014). The span 


of the simulated observation is 14.9 years, with uniform biweekly cadence leading to the 
same number of samples Nj = 389 for each pulsar. The signal induced by this GW source 
is calculated for each pulsar in the PTA following Eq. Independent realizations of white 
Gaussian noise are added to the signal for each pulsar, with the noise standard deviation 
for a given pulsar set equal to its timing residual rms (we used the same level of noise as in 
WMJl). To characterize the strength of the signal in the data, we use the network SNR of 
the signal defined as 


N„ 


1/2 


Pn — 




Np Nj ^ j \ 2^ 


1/2 


, 7=1 


,/=l k=l 




(27) 


We choose pn = 100, 30, 8 to represent the strong, moderate, and weak signal scenarios 
respectively. For each scenario, 200 independent realizations of data are generated. The 
results from each scenario are discussed in the following sub-sections. Although not required 
from the point of view of signal analysis, these S/N choices can be associated with astrophys- 
ical parameters for concreteness. For example, the S/N values above in descending order 
could arise from a SMBHB system that has a chirp mass A4c ~ 10® Mq, an orbital period of 
P = 3.2 yrs, and that is located at a distance approximately 10, 30 and 125 Mpc from Earth, 
respectively. As already mentioned in Sec. we ignore the evolution of the binary orbital 
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frequency, which is a reasonable assumption for the purpose of studying the performance of 
the algorrithm, although this assumption can become invalid in the late stage of the SMBHB 
evolution. 

Fig. [^compares the log likelihood ratio found by the MaxPhase algorithm with its value 
for the true signal parameters. For each of the three network S/N, we can see that the former 
is greater than the latter for all realizations. This is the least one expects from any viable 
estimation algorithm and we see that the MaxPhase algorithm passes this basic test. 

To obtain the threshold for detection or to set upper limits, the distribution of the detec¬ 
tion statistic under I-Lq is required. This involves hnding the distribution of the maximum of 
the log likelihood ratio A. We use Monte-Carlo simulation with 500 independent noise-only 
realizations of data to directly estimate this distribution. Figure shows the distributions 
of the detection statistic GLRT(r) under the noise-only case and under the three different 
signal scenarios. The histograms for the "Ho and Pn = 8 cases can be fitted well by the Log- 
Normal distribution In u). The distribution converges to a normal distribution 7V(/i, a) 
as the signal strength increases. 
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X 10 




Fig. 1.— Histograms of the detection statistic normalized by the total nnmber of trials. The 
histogram in the npper left panel is for the Ho case; the histogram in the npper right panel 
is for pn = 8 case; the histogram in lower left panel is for the pn = 30 case; the histogram 
in the lower right panel is for the pn = 100 case. The red cnrve in the each panel shows 
the best fit distribntion. These are \nN'{p = 2.89, a = 0.12), \nAf{p = 3.67, a = 0.23), 
A7(/i = 463.7, a = 31.2), and A/'(/i = 5055.3, a = 95.8) respectively. 


4.1. Strong signal 

In this case, the network S/N pn = 100. Fignre shows a typical realization of the 
simnlated timing residnals for the nine pnlsars (thin gray line). The magnitnde and the phase 
of the noise-free timing residnal (black dashed line) depend on the location and distance of 
the sonrce and pulsar in the array. In this strong signal scenario, the signal in most of the 
pulsars is comparable to or even stronger than the respective noise. The reconstructed signal 
is obtained by Equation or in which the input extrinsic and intrinsic parameters are the 
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p =100 p =30 p =8 





Fig. 2.— Log likelihood ratio values obtained from MaxPhase v.s. Log likelihood ratio for 
the true signal. From left to right, the panels correspond to the network S/N = 100, 30, 
8 scenarios, respectively. There are 200 data realizations for each scenario. 


ones estimated by MaxPhase. 

As seen from Figure the estimated signal is indistinguishable from the injected one 
for all the pulsars except PSR J1744-1134 (separation angle is 150°) which contains the 
weakest signal and contributes insignihcantly to the detection statistic. Figure [T] shows 
the distribution of the detection statistic values under the null ("Ho) and alternative ("Ha) 
hypotheses. Comparing the distributions for null and = 100, it is clear that the detection 
probability Qd is nearly unity if the threshold for claiming a detection is chosen as the highest 
value for the null case. Since we have used 500 realizations for Tio, the false alarm probability 
for this choice is approximately 2 x 10“^. 

Figure 1^ shows the distributions of estimated parameters {a, 5, C, t,-0, cUgw} that are 
astrophysically interesting. The distributions were estimated from 200 independent data 
realizations. For the sky localization, most of the estimated locations are very close to the 
true one. However, for 43 out of the 200 realizations, the estimated locations appear to 
fall on some secondary maxima located along an arc. Similarly, the scatter for t and 0 is 
larger than expected. In contrast, the true values of and ( are well within the one-sigma 
uncertainty of 2.9 x 10“^ rad-yr“^ and 6.11 x 10“^ sec, respectively, calculated from the 200 
realizations. 
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PSRJ0030+0451 


PSRJ0437-4715 


PSR J1640+2224 
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PSR J1713+0747 



X 10''^ 
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PSR J1909-3744 
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PSR J1744-1134 
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X 10'^ 
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PSR J2317+1439 



Fig. 3.— Data realization showing the simulated timing residuals (thin gray line) and signal 
(dash black line) for all pulsars. The network S/N pn = 100. The reconstructed signals 
are shown as solid curves. For most pulsars, except J1744-1134, the true and reconstructed 
signal are almost indistinguishable from each other. For PSR J1744-1134, we have zoomed 
into the noise so that the signal can be seen clearly. 
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Fig. 4.— Two-dimensional scatter plots (top) and histograms (bottom) of estimated param¬ 
eters for network S/N = 100. The star and the red vertical line mark the true values of 
the parameters. The dashed vertical line marks the mean value, and the shaded area covers 
the one-sigma region around the mean. The total number of trials is 200. 


4.2. Moderate signal 

Figure shows a realization of the simulated data for a network S/N = 30. The 
noise is now seen to be stronger than the signal in most of the pulsars. The recovered 
signal continues to agree with the injected one quite well. Note that for PSR J1744-1134, 
J1713-I-0747 and J1640-I-2224, the deviation from the true signals is mainly in the amplitude, 
while the offset in phase is not significant. From the distribution of the detection statistic 
in Fig. the detection probability is still practically unity for a detection threshold with an 
approximate false alarm probability of 2 x 10“^. In Figure we see more clearly that the 
sky locations are centered on the same secondary maxima as in the pn = 100 case (Fig|^ 
but with an increased scatter around each. We also note that the bias in the estimation of 
the inclination and polarization angle is increased. The one-sigma uncertainties for cugw and 
C increase to 0.01 rad ■ yr“^ and 8.63 x 10“^ sec, respectively. The increase in the errors is 
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roughly consistent with their expected linear dependence on network S/N. 
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PSR J2317+1439 



Fig. 5.— Data realization showing the simulated timing residuals (thin gray line) and signal 
(dash black line) for all pulsars. The network S/N is pn = 30. The reconstructed signals are 
shown as solid curves. For some pulsars, such as PSR J1744-1134 and J1857+0943, we have 
zoomed into the noise in the subplots, so that the signal can be seen clearly. 
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Fig. 6.— Two-dimensional scatter plots (top) and histograms (bottom) of estimated param¬ 
eters for network S/N = 30. The star and the red vertical line mark the true values of 
the parameters. The dashed vertical line marks the mean value, and the shaded area covers 
the one-sigma region around the mean. The total number of trials is 200. 


4.3. Weak signal 


In this case, the network S/N = 8 corresponds to a weak and barely detectable 
signal. This is also the network S/N used in Taylor et ah (2014). Figure shows one of 
the realizations of the simulated timing residuals. In this scenario, the noise dominates the 
signal in all pulsars. This illustrates the most likely situation with the current level of timing 
precision obtained in pulsar timing arrays. Even though the noise is loud, the recovered 
signals have deviations mainly in the amplitude (usually biased towards a larger value), 
while the offset in the phase is tolerable. In Figure]^ the scatter of the sky location becomes 
larger, but the presence of secondary maxima seen in the previous cases is still discernible. 
However, now the true location attracts the least number of trial values. The bias in the 
estimation of the inclination and polarization angle is now much clearer. The uncertainties in 
cugw and ( are 0.036 rad-yr“^ and 2.11 x 10“® sec, again roughly consistent with the expected 
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linear dependence on network S/N. From Figure the detection probability is Qd ~ 0.86 if 
we choose the detection threshold to be the largest value of the noise-only distribution. In 
this case, the signal is still large enough to be detected, although it cannot be localized at 

all. 


PSRJ0030+0451 
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Fig. 7.— Data realization showing the simulated timing residuals (thin gray line) and signal 
(dash black line) for all pulsars. The network S/N is pn = 8. The reconstructed signals are 
shown as solid curves. For most pulsars, we have zoomed into the noise in the subplots, so 
that the signal can be manifested. 
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Fig. 8.— Two-dimensional scatter plots (top) and histograms (bottom) of estimated param¬ 
eters for network S/N = 8. The star and the red vertical line mark the true values of the 
parameters. The dashed vertical line marks the mean value, and the shaded area covers the 
one-sigma region around the mean. The total number of trials is 200. 


4.4. Comparison with other algorithms 


In Figure]^ we show the log likelihood ratio from MaxPhase versus those from WMJl 
for a subset of 100 data realizations chosen randomly from the set used for the simulations 
reported above. We can see that for most realizations in each of the three signal strength 
scenarios, the former can find a marginally larger (better) log likelihood ratio than the latter, 
which suggests that MaxPhase can achieve a greater detection probability than WMJl for 
a given detection threshold. 


Comparing parameter estimation performance. Figure 10 gives the estimated sky lo¬ 
cations from WMJl. For the = 100 case, the sky localization is very similar to the 
corresponding one in Figure except that there are no secondary maxima. With the de¬ 
creasing of pn to 30 and 8, the sky localization scatter increases but it still appears uni-modal 
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and concentrated aronnd the true value. 


Comparing our results for MaxPhase and WMJl with those of the Bayesian method 
(Taylor et ah |2014 ), we make the following observations. From the receiver operating char¬ 


acteristics (ROC) curve reported in Fig. 6 of Taylor et ah (2014), the detection probability 
appears to be close to unity for = 8 case at the lowest false alarm probability of 0.01 
used in that paper. The corresponding detection probability from MaxPhase is 97.5% (and 
rapidly approaches unity for higher FAP). Thus, the detection performance of MaxPhase is 
comparable to that of the Bayesian method. The distribution of the estimated parameters in 
the Frequentist case can be compared more reliably with the distribution of the maximum-a- 
posteriori value of the parameters in the Bayesian method. We have picked the same source 


parameters as in Taylor et ah (2014), so the comparison is straightforward. Although there 


are differences between the two analyses, such as the use of irregularly versus regularly sam¬ 
pled data, they should not impact the comparison too much. From Figure]^ (MaxPhase), 


Figure |10| (WMJl) in this paper and Figure 5 (Bayesian) in 

^'aylor et al. 

(2014), we see 

that for Pn = 8 case (the only case considered in Taylor et ah 

(2014)), the estimated sky 


location by MaxPhase is inferior to the Bayesian method, while the results from WMJl and 
the Bayesian method are qualitatively comparable. 

Regarding computational costs, MaxPhase takes 6.7 min on average to complete one 
PSO run for each data realization on a single processor core, while the WMJl algorithm 
takes 89 min. As far as obtaining point estimates of the signal parameters is concerned, the 
reported computational cost of the Bayesian algorithm appears to be significantly higher 
than either of the Frequentist methods. For example, 48 cores are used in Taylor et ah 


(2014 

) to run a parallelized implementation of the MultiNest algorithm ( 

Feroz et al. 

2009) 


and the analysis is reported to typically take up to 45 minutes to complete at a network 
S/N pn = 10. However, it should be noted that the Bayesian method also maps out the 
posterior probability distribution of parameters, which may provide useful information in 
an analysis. Interestingly, it has been demonstrated in the context of CMB analysis that 
a htting procedure may be combined with PSO to map out the likelihood function locally 
around the point estimate ( Prasad fc Souradeep||2012 ). Thus, it may be possible to similarly 
extend MaxPhase (or WMJl) to obtain information similar to that of a Bayesian method. 
This will lead to a corresponding increase in the computational cost of MaxPhase. 





















COS(7i/ 2 5) Log-LR from MaxPhase 


- 23 - 

Pn=100 p =30 p =8 





Fig. 9.— In each panel, Log likelihood ratio values from MaxPhase algorithm v.s. WMJl 
algorithm are shown for three scenarios with = 100, 30, and 8, respectively. The number 
of independent data realization is 100. In almost all trials, the log-likelihood ratios are seen 
to be higher for the MaxPhase algorithm. 
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Fig. 10.— In each panel, blue circles show the estimated sky locations for the source, which 
are obtained from the WMJl algorithm for a PTA consisted of 9 pulsars. A red star marks 
the true location of the source used in the simulation. The x-axis represents Right Ascension 
and the y-axis represents the declination. The total number of independent data realization 
is 100. The panel on the right may be compared with Fig. 5 of Taylor et ah (2014). 


4.5. Effect of increasing the PTA size 

As we noticed in the strong signal scenario, the maximization over the pulsar phases 
leaves behind a log-likelihood ratio that has strong secondary maxima, a feature that is 
absent if the pulsar phases are treated as intrinsic parameters. If these secondary maxima 
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are comparable to the global maximum in value, they become attractors for stochastic search 
algorithms and reduce their effectiveness in locating the global maximum. With a decrease 
in signal-to-noise ratio, the probability of the locations of such secondary maxima becoming 
the global maximum increases. Both these effects worsen parameter estimation as we see in 
the moderate and weak signal cases. 


This situation can be substentially improved by adding more pulsars in a PTA. Unlike 
the ground and space borne laser interferometers, adding more detectors (millisecond pulsars) 
in a PTA is technically easier and cheaper in terms of costs. Here, we demonstrate this by 
using the NANOGrav conhguration ( Demorest et al.|2013 ) which consists of 17 pulsars in the 
catalog. We keep the network S/N the same for each scenario as in the analysis reported 
in Sec. 4.1 - 4. 4| with 9 pulsars. Accordingly, the overall amplitude ( of the GW is scaled 
down. This implies that the signal amplitude for individual pulsars becomes significantly 
lower. 


Fig. [TT] presents the estimations of Right Ascension and declination of the GW source 
for 100 independent data realizations with = 100, 30 and 8 cases. Glearly, the scatter 
and the secondary maxima in the sky localization are effectively suppressed comparing to 
the ones in Fig. and for the strong and moderate signal cases. For the weak signal case, 
although the localization is still inferior compared to WMJl and the Bayesian algorithm, 
the bias appearing in Fig. [^is gone and the distribution becomes quite uniform. 



Fig. 11.— In each panel, blue circles show the estimated sky locations of the source, which are 
obtained from the MaxPhase algorithm for a PTA consisted of 17 pulsars. A red star marks 
the true location of the source used in the simulation. The x-axis represents Right Ascension 
and the y-axis represents the declination. The number of independent data realization is 
100 . 
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5. Summary and conclusions 


Combined with WMJl, this paper completes the hrst step in the program of implement¬ 
ing a purely Frequentist detection and parameter estimation approach for continuous wave 
GW signals using PTAs. There exists a dichotomy in how a GLRT can be implemented 
for this problem and this paper addresses the approach where pulsar phases are treated as 
extrinsic parameters that are maximized semi-analytically. Maximizing over pulsar phases 
is attractive compared to the alternative where they are treated as intrinsic parameters be¬ 
cause the GLRT becomes scalable with the size of a PTA. The maximization over the pulsar 
phases leaves behind a 7-dimensional numerical optimization problem irrespective of the 
number of pulsars in a PTA. We hnd that the latter problem is effectively handled using 
PSO, as was the case in WMJl, without requiring much tuning. Computational costs of 
PTA data analysis methods will become especially important for analyzing the IPTA data 
set that includes about 50 pulsars. 


The approach based on the analytical maximization over pulsar phases has the merit 
that it does not involve the type of constrained maximization that appeared in WMJl. This 
greatly simplifies the implementation and boosts the computation speed of the method. 
However, our results indicate that the performance of the method is not as good as far as 
estimation of the source location and some of the other angular parameters is concerned. 
The increased errors appear to stem from secondary maxima. The fact that these secondary 
maxima disappear when the PTA size is increased, suggests that they are likely to be the 
result of not taking the ill-posedness of the GW network analysis problem - well known in 
the context of ground based detector networks (Klimenko et ah 2005 Mohanty et ah 2006) 
- into account. 


Mitigation of ill-posedness can be achieved by regularization of the inverse problem in 
some form (Greville 1959; Tikhonov & Arsenin 1977 Rakhmanov 2006). However, unlike 
ground based networks of large-scale detectors, we have the simple option in the case of PTAs 
to increase the number of independent detectors (i.e., pulsars). In fact, the NANOGrav col¬ 
laboration is adding 3-4 new MSPs, discovered from the ongoing major pulsar surveys at 
Arecibo Observatory and Green Bank Telescope (e.g., PALFA and GBNGG), in the observa¬ 
tion campaign every year. As known for the ground-based case, this should reduce the effect 
of ill-posedness. That this is so is shown explicitly by taking a PTA with a larger number of 
pulsars. However, although increasing the number of pulsars is an obvious way to mitigate 
the problem of ill-posedness, the results for the weak signal case -the realistic one for the 
current PTAs- show that it cannot be completely ignored and must be addressed properly. 
We leave a deeper look at the problem of ill-posedness and regularization to future work. 


The results reported here were obtained under the following limitations. The simu- 
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lated data was evenly sampled whereas real data will have irregular sampling. However, our 
method works entirely in the time domain, and no major changes are needed to accommo¬ 
date irregularly sampled data. In fact, if the irregularly sampled data have identically and 
independently distributed noise samples, no change in the algorithm is required. If, as some 
studies point out, the noise is not Gaussian or stationary, the actual covariance matrix for 


the given data will need to be modeled (or estimated) (Wang & et ah 

2015 

Wang 

2015). 

Regarding non-Gaussianity, it is worth noting that Finn ( 

2001 

) shows that coherent tech- 


niques, such as MaxPhase and WMJl, are generally robust against non-Gaussianity in the 
noise components. 


The timing residuals for real data are obtained by fitting, using weighted least squares, 
a timing model to the data and subtracting it out. The timing model contains a set of 
parameters specific to the pulsar whose pulse arrival times are being fitted. The fitting 
procedure can affect the signal form as well as the statistics of the noise in the residual. 
When analyzing observational data, a common practice is to use the projection matrix R 
suggested by Demorest et ah (2013). A nice feature of R is that it only depends on the 
fitting model and the weighting matrix used, not the data itself. The influence of fitting can 
be easily taken into account by operating R on the timing residuals in the algorithm. 


In constructing the GLRT, we assumed that the noise parameters are known a priori or 
can be estimated independently of the GW analysis. A more sophisticated approach would 
include the noise parameters as part of the estimation procedure. Since these additional 
parameters would be intrinsic in nature, directly including them in the GLRT would in¬ 
crease the search space dimensionality for PSO significantly. For example, the number of 
dimension increases from 7 to 52 for a PTA with 9 pulsars. Although such large dimensional 
optimization problems appear frequently in the PSO literature, it remains to be seen how 
the increase in dimensionality will pan out in the case of PTA data analysis. Some dimen¬ 
sional reduction scheme, of which hxing the noise model parameters a priori is an extreme 
example, will probably need to be implemented. 


Finally, our signal model does not include the ellipticity or the evolution of binary orbit 
during the period of observation. However, these modihcations will only lead to a few more 
intrinsic parameters that are specific to the GW signal and not associated with the pulsars. 
A study of the GLRT approach for more sophisticated signal models will be carried out in 
future works. 
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